Target-oriented phone selection from universal phone set for spoken language recognition
Abstract
This paper studies a target-oriented phone selection strategy for constructing phone tokenizers in the Parallel Phone Recognizers followed by Vector Space Model (PPR-VSM) paradigm of spoken language recognition. With this phone selection strategy, one derives a set of target-oriented phone tokenizers (TOPTs), each having a subset of phones with high discriminative ability for a target language. Two phone selection methods are proposed to derive such phone subsets from a phone recognizer. We show that the TOPTs derived from a universal phone recognizer (UPR) outperform those derived from language-specific phone recognizers. The TOPT front-end derived from a UPR also consistently outperforms the UPR front-end without involving additional acoustic modeling. We achieve equal error rates (EERs) of 1.33%, 1.75% and 2.80% on the NIST 1996, 2003 and 2007 LRE databases, respectively, for 30-second closed-set tests by including multiple TOPTs in the PPR.
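The abstract does not spell out the two phone selection methods, so the following is only a rough sketch of the general idea of picking a phone subset that discriminates a target language from the others. The scoring rule (absolute difference in relative phone frequency) and the names `phone_distribution` and `select_target_phones` are assumptions made for illustration; they are not the paper's methods.

```python
from collections import Counter


def phone_distribution(utterances):
    """Relative frequency of each phone over a list of phone sequences."""
    counts = Counter(p for utt in utterances for p in utt)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {p: c / total for p, c in counts.items()}


def select_target_phones(target_utts, other_utts, k):
    """Return the k phones whose relative frequency differs most between
    the target language and the remaining languages.

    Assumption: this frequency-difference score is a stand-in for a real
    discriminative criterion, chosen only to keep the example small.
    """
    target = phone_distribution(target_utts)
    other = phone_distribution(other_utts)
    phones = set(target) | set(other)
    score = {p: abs(target.get(p, 0.0) - other.get(p, 0.0)) for p in phones}
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(phones, key=lambda p: (-score[p], p))[:k]


# Toy decoded phone sequences (hypothetical data).
target_utts = [["a", "b", "a"], ["a", "c"]]
other_utts = [["b", "c", "d"], ["b", "d"]]
selected = select_target_phones(target_utts, other_utts, 2)
```

In a setup like the one the abstract describes, one such subset per target language would define one tokenizer, and all subsets would be drawn from the same universal phone inventory, so no extra acoustic models would be needed.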
Similar resources
Towards High Performance Phonotactic Feature for Spoken Language Recognition
With the demands of globalization, multilingual speech is increasingly common in conversational telephone speech, broadcast news and internet podcasts. Therefore, automatic spoken language recognition has become an important technology in multilingual speech related applications. For example, automatic spoken language recognition has been used as a preprocessing component for spoken language tr...
Target-aware language models for spoken language recognition
This paper studies a new way of constructing multiple phone tokenizers for language recognition. In this approach, each phone tokenizer for a target language will share a common set of acoustic models, while each tokenizer will have a unique phone-based language model (LM) trained for a specific target language. The target-aware language models (TALM) are constructed to capture the discriminati...
Comparison of spectral methods for spoken language identification
Automatic spoken language identification is the task of identifying a language from the speech signal. Language identification systems can be divided into two categories: spectral-based methods and phonetic-based methods. In the former, short-time characteristics of the speech spectrum are extracted as a multi-dimensional vector. The statistical model of these features is then obtained for each language. The ...
High-resolution acoustic modeling and compact language modeling of language-universal speech attributes for spoken language identification
We propose a framework to automatically construct a collection of high-resolution (HR) language-universal units for spoken language identification (LID). Based on the popular phone recognition language modeling (PRLM) approach to LID, a set of universal attribute recognizers (UARs) is first established to replace phone recognizers (PRs) using manner and place of articulation as attribute units ...
Exploring universal attribute characterization of spoken languages for spoken language recognition
We propose a novel universal acoustic characterization approach to spoken language identification (LID), in which any spoken language is described with a common set of fundamental units defined “universally.” Specifically, manner and place of articulation form this unit inventory and are used to build a set of universal attribute models with data-driven techniques. Using the vector space modeli...